Multicycle Broadcast Bypass: Too Readily Overlooked

Authors

  • Peter G. Sassone
  • D. Scott Wills

Abstract

The bypass path, also called the forwarding path, allows processors to broadcast operands from one functional unit to another more quickly than through the register file. In modern superscalar out-of-order CPUs, bypass is part of the execution pipeline stage, allowing dependent instructions to issue on subsequent cycles. In these modern machines, however, bypass network complexity is becoming a limiting factor in frequency scaling. Traditionally, architects have been unwilling to separate execute and bypass into different stages for fear of huge IPC losses. Through cycle-time calculations and cycle-accurate simulation with Spec2000int and Mediabench, though, we show that multicycle broadcast bypass is a simple and beneficial design choice. By allowing bypassed values multiple cycles to reach their destination, processor frequency can be increased by more than IPC decreases. This solution involves no repeaters, no instruction steering, and no complex control logic. At 90 nm, instruction throughput increases by 9% when the bypass is moved into a separate stage on a four-wide machine, and by 16% when two bypass stages are added to an eight-wide machine.
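The trade-off the abstract describes can be illustrated with a little arithmetic: instruction throughput is the product of IPC and clock frequency, so a frequency gain pays off only if it outweighs the accompanying IPC loss. The sketch below is a minimal illustration of that relationship, not the authors' cycle-time model; the function names and the input percentages are hypothetical and are not taken from the paper.

```python
def throughput_change(freq_gain: float, ipc_loss: float) -> float:
    """Relative change in instructions per second.

    Throughput = IPC * frequency, so scaling frequency by (1 + freq_gain)
    and IPC by (1 - ipc_loss) scales throughput by their product.
    Both arguments are fractions (0.20 means 20%).
    """
    return (1.0 + freq_gain) * (1.0 - ipc_loss) - 1.0


def break_even_ipc_loss(freq_gain: float) -> float:
    """Largest fractional IPC loss that a given frequency gain can absorb."""
    return freq_gain / (1.0 + freq_gain)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration, not results from the paper.
    freq_gain, ipc_loss = 0.20, 0.10
    print(f"throughput change: {throughput_change(freq_gain, ipc_loss):+.1%}")
    print(f"break-even IPC loss: {break_even_ipc_loss(freq_gain):.1%}")
```

Under this simple model, adding a bypass stage is worthwhile whenever the measured IPC loss stays below `break_even_ipc_loss` for the frequency gain that the shorter critical path allows.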

Similar resources

Application-Bypass Broadcast in MPICH over GM

Processes of a parallel program can become unsynchronized, or skewed, during the course of running an application. Processes can become skewed as a result of unbalanced or asymmetric code, or through the use of heterogeneous systems, where nodes in the system have different performance characteristics, as well as random, unpredictable effects such as the processes not being started at exactly t...

Multicycle Polling Scheduling Algorithm

The paper deals with the scheduling of periodic information flow in a FieldBus environment. The scheduling problem is defined from an analytical point of view, giving a brief survey of the most well-known solutions and illustrating multicycle polling scheduling, which is based on the hypothesis that all the production periods of the periodic processes to be scheduled are harmonic. Although in some...

A comparison of laboratory findings in coronary artery bypass surgery with and without cardiopulmonary bypass

Background: The search for a coronary artery bypass technique with fewer complications is ongoing; to this end, many studies have compared patients undergoing CABG with and without cardiopulmonary bypass. This study was carried out to compare laboratory findings after coronary artery bypass in these two groups of patients. Materials and Methods: In a retrospective study, 167 patients ...

Demand-Only Broadcast: Reducing Register File and Bypass Power in Clustered Execution Cores

This paper introduces a technique called Demand-Only Broadcast that reduces the power consumption of the register file and result bypass network in a clustered execution core. With this technique, an instruction’s result is only broadcast within remote clusters if it is needed by dependants in those clusters. Demand-Only Broadcast was evaluated using a performance–power simulator of a high-perf...

Publication date: 2004